ABSTRACT


In X-ray binary star systems consisting of a compact object that accretes 
material from an orbiting secondary star, there is no straightforward means to 
decide if the compact object is a black hole or a neutron star. To assist this 
process we develop a Bayesian statistical model which makes use of the fact 
that X-ray binary systems appear to cluster based on their compact object type 
when viewed from a 3-dimensional coordinate system derived from X-ray spectral 
data, where the first coordinate is the ratio of counts in the mid to low energy bands
(color 1), the second coordinate is the ratio of counts in the high to low energy
bands (color 2), and the third coordinate is the sum of counts in all three bands.
Precisely, we use this model to estimate the probabilities that an X-ray binary 
system contains a black hole, non-pulsing neutron star or pulsing neutron star. 
In particular we utilize a latent variable model in which the latent variables 
follow a Gaussian process prior distribution, and hence we are able to induce 
the spatial correlation we believe exists between systems of the same type. The 
utility of this approach is evidenced by the accurate prediction of system types 
using Rossi X-ray Timing Explorer All Sky Monitor data, but it is not flawless.
In particular, non-pulsing neutron star systems containing “bursters” that are close
to the boundary demarcating systems containing black holes tend to be classified
as black hole systems. As a byproduct of our analyses, we provide the astronomer
with public R code that can be used to predict the compact object type of X-ray 
binaries given training data. 

Subject headings: methods: statistical; methods: data analysis; pulsars: general; stars: 
black holes; stars: neutron; X-rays: binaries

1. Introduction


As our ability to acquire and archive data in all fields rapidly grows, the tools
for searching these data for pattern, order, and ultimately meaning need to grow
commensurately. A critical issue in this ongoing paradigm shift is that of multivariate data
with complex, hidden geometric structure. Color-color (CC) diagrams, which provide
spectral information over different energy ranges, and color-intensity (CI) diagrams, which
show brightness variations for a given color, are common and easily obtained measurements
that have long been used to classify X-ray binary types. White & Marshall (1984) plotted
all X-ray binaries observed by the HEAO-1 satellite on one CC plot; they found that
systems containing black holes clustered in one corner of their diagram and pulsars clustered
in an opposing corner. Although they found significant overlap of several classes of object in
the center, they were able to use this clustering to identify new black hole candidates. In Vrtilek
& Boroson (2013; hereafter VB13) we show that when CC and CI diagrams are combined into a
three-dimensional color-color-intensity (CCI) plot, different types of X-ray binaries (XRBs)
separate into complex but geometrically distinct volumes. VB13 model the volumes crudely by
computing a centroid and constructing an ellipsoid around the centroid that contains 50 percent
of all points while minimizing the volume of the ellipsoid. We suggest that these diagrams provide an
easily used, model-independent way to separate classes of systems, in particular systems
containing black holes from those containing neutron stars, or systems that can produce jets
from those that cannot. As a next step towards understanding the physical mechanisms
behind this separation of compact object types we have developed a probabilistic (Bayesian)
model which provides a supervised learning approach: unknown classifications of XRBs
are predicted given known classifications. We provide the astronomer with R code that
takes CCI data as input and outputs the estimated probabilities that a system contains a black
hole, a pulsar, or a non-pulsing neutron star, in addition to standard errors for these
estimates. This software provides the astronomer with more information than an off-the-shelf
machine learning solution because such a method typically produces point estimates
for the classes of the observations, as opposed to an entire distribution for the classes of the
observations.

In Section 2 we describe the data used; in Section 3 we specify the models we have used 
for estimating the probabilities that the compact object type of an X-ray binary system is 
a black hole, non-pulsing neutron star, or pulsar. In Section 4 we present our results and 
their implications and in Section 5 we conclude with a summary and future directions for 
this work. 


2. Data


Data on X-ray binaries were obtained by the All Sky Monitor (ASM; Levine et al. 1996)
on board NASA’s Rossi X-ray Timing Explorer (RXTE), which operated continuously
for nearly fifteen years. Due to as yet uncalibrated gain changes in the instrument over
the last two years of its life, we use only data obtained within the first 13 years. The MIT
ASM team provides the data in three energy bins (1.3-3.0 keV, 3.0-5.0 keV, and 5.0-12.0 keV)
sampled 4-8 times a day. We take one-day averages and define our colors as the ratios of the mid
to low energy range (C1) and the high to low energy range (C2). The sum of the three energy
bands is used to represent the intensity of the source; this value is normalized by dividing
the total counts by the average of the top 1 percent of the data for any given source. These
form a three-tuple consisting of the features, or background covariates in statistical terms,
for each of the observations. Note that since these features are defined in terms of ratios
of counts, they are unitless. We also restrict ourselves to detections that have a signal to
noise of at least five, where signal to noise is defined as the ratio of the number of counts in
a particular bin to the error on the number of counts. Fig. 1 shows an example of a CCI
diagram constructed with three types of X-ray binaries.
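
The feature construction described above can be sketched as follows. This is an illustrative Python sketch and not the authors' R code; the names `DailyAverage`, `passes_snr`, and `cci_features` are hypothetical. It assumes that the signal to noise cut is applied to each of the three bands, and that the top 1 percent normalization is computed from the observations that survive the cut; the text does not specify either detail.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DailyAverage:
    """One-day average counts and errors in the three ASM bands (hypothetical container)."""
    low: float       # 1.3-3.0 keV
    mid: float       # 3.0-5.0 keV
    high: float      # 5.0-12.0 keV
    low_err: float
    mid_err: float
    high_err: float


def passes_snr(obs: DailyAverage, threshold: float = 5.0) -> bool:
    """Assumption: every band must reach the signal to noise threshold."""
    pairs = ((obs.low, obs.low_err), (obs.mid, obs.mid_err), (obs.high, obs.high_err))
    return all(err > 0 and counts / err >= threshold for counts, err in pairs)


def cci_features(observations: List[DailyAverage],
                 threshold: float = 5.0,
                 top_fraction: float = 0.01) -> List[Tuple[float, float, float]]:
    """Return (C1, C2, normalized intensity) for each observation of one source."""
    kept = [o for o in observations if passes_snr(o, threshold)]
    if not kept:
        return []
    totals = sorted((o.low + o.mid + o.high for o in kept), reverse=True)
    # Average of the brightest 1 percent of points (at least one point).
    n_top = max(1, int(len(totals) * top_fraction))
    norm = sum(totals[:n_top]) / n_top
    return [(o.mid / o.low, o.high / o.low, (o.low + o.mid + o.high) / norm)
            for o in kept]
```

Each returned tuple is one point in the CCI coordinate system; all three coordinates are unitless because they are ratios of counts.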
The classification of X-ray binaries is not simple and different authors tend to use
different criteria. For our training set we used 24 systems whose classifications are robust,
that is, they are consistent over numerous authors. In particular we first considered the
classifications from the catalogs of Liu, van Paradijs, & van den Heuvel (2001, 2006). We
then used the Remillard & McClintock (2006) identifications of confirmed black hole systems,
the Homan et al. (2010) identifications of Z and Atoll sources, and the Bildsten et al. (1997)
identifications of accreting pulsars. We excluded any of the above types that were also
identified as bursters as these vary from author to author. This left us with 9 systems
containing confirmed black holes (Cyg X-1, LMC X-1, J1118+480, J1550-564, J1650-500,
J1655-40, GX 339-4, J1859+226, GRS 1915+105), 9 confirmed pulsars (J0352+309,
J1901+03, J1947+300, J2030+375, J1538-522, Cen X-3, Her X-1, SMC X-1, Vela X-1), and
6 non-pulsing neutron star systems (Sco X-1, Cyg X-2, GX 17+2, GX 349+2, GX 9+1, GX
9+9).


We test our model by predicting the compact object type of three groups of systems.
The first contains 6 systems whose type is unambiguously classified: one confirmed black
hole (LMC X-3); one non-pulsing neutron star system (GX 5-1); and 4 pulsing neutron star
systems (1744-28, 0656-072, 0535+262, 0115+634). The second group contains
systems that are classified as both burster and atoll sources (Ser X-1, Aql X-1, 1916-053,
1608-522, 1254-69, and 0614+091) and hence have possibly ambiguous classifications. The
third group consists of sources that are either unclassified or have multiple classifications
across various authors (1900-245, GX 3+1, 1701-462, 1636-53, and 1700-37).


The training data set consists of 40857 observations after the preprocessing steps
delineated above, of which 13098 come from black hole systems, 25366 come from
non-pulsing neutron star systems, and 2393 come from pulsing neutron star systems. The imbalance
in observations between different compact object types is due to the fact that brighter sources
are more likely to be detected than weak sources, yet the brightness of sources is not uniform
amongst star systems. See Figure 2 for a comparison of brightness across systems.


Because observations that are less bright are inherently more likely to be below the
signal to noise threshold that we utilized in preprocessing, the data are not missing at
random (Little & Rubin 2002). The imbalance in the number of observations by system is problematic
because visual inspection of systems of the same compact object type indicates substantial
variability between systems; hence it is prudent to ensure that the true variation of CCI
values for each compact object type is accurately reflected in the training set. For instance,
see Figure 3 for a visualization of the variance between systems that contain black holes.
Additionally, the model we employ for generating multiple imputations, discussed in the
next section, involves a Gaussian process and hence can be quite computationally expensive
to work with, generally requiring computations that take $O(N^3)$ time, where N is the
number of data points in the data set. One solution to mitigate both of these issues is to
subsample the training set, where the probability that a particular observation is selected
for inclusion in the smaller training set is inversely proportional to the total number of
observations of its system in the entire training set. We sample 10 percent of the training
data in this manner, without replacement, to achieve a final training set consisting of 4085
observations, of which 1486 come from black hole systems, 1465 come from non-pulsing
neutron star systems, and 1134 come from pulsing neutron star systems. The histograms in
Figure 4 show the balance between various systems before and after subsampling.
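
One way to realize the subsampling step is sketched below in Python. This is an illustration and not the authors' R code: the function name `subsample_indices` is hypothetical, and the weighted sampling without replacement is done with exponential sort keys (the Efraimidis-Spirakis scheme), which is one reasonable choice among several; the text does not say which scheme was used. Under this scheme the selection weights are inversely proportional to the per-system counts, although the resulting inclusion probabilities are only approximately so.

```python
import math
import random
from collections import Counter
from typing import List, Sequence


def subsample_indices(system_ids: Sequence[str],
                      fraction: float = 0.1,
                      seed: int = 0) -> List[int]:
    """Pick a fraction of observations without replacement.

    Each observation gets weight 1 / (number of observations of its system),
    so heavily observed systems are down-weighted.
    """
    rng = random.Random(seed)
    counts = Counter(system_ids)
    n_keep = int(len(system_ids) * fraction)

    keyed = []
    for index, system in enumerate(system_ids):
        u = 1.0 - rng.random()              # uniform on (0, 1], so log(u) is finite
        # Efraimidis-Spirakis key u ** (1 / weight), stored on the log scale:
        # log(u) / weight = log(u) * counts[system].
        keyed.append((math.log(u) * counts[system], index))

    keyed.sort(reverse=True)                # the largest keys are selected
    return sorted(index for _, index in keyed[:n_keep])
```

With 40857 observations and `fraction=0.1`, `n_keep` is 4085, matching the size of the final training set quoted above.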


3. Models and Algorithm 

Our approach estimates the probabilities that the compact object type of an XRB is
a black hole, non-pulsing neutron star, or pulsar from posterior predictions of the compact
object type associated with CCI observations within the system. This approach is similar
to the multiple imputations methodology developed in the context of survey analysis, where
the proper handling of missing data is crucial for obtaining valid statistical inferences, which
is discussed in further detail by Rubin (1996). In summary, the multiple draws from the
predictive distribution of compact object type allow the astronomer to make a judgement
about her or his belief about what the compact object type of a given system is, and
provide more information than a point prediction. The salient property of this approach
is that it takes into account the inherent uncertainty in the prediction for each individual
observation; a concrete illustration of the output of this methodology is displayed in
the Results section. For more on applied Bayesian data analysis, see Gelman et al. (2013).


The astronomer’s probability model for the compact object type of observations from a
particular system is a trinomial model, where the relevant estimands are the probabilities
that the compact object type is a black hole, non-pulsing neutron star, or pulsar. The
astronomer’s objective is to estimate these probabilities given the predicted compact object
type for all observations within the system. Note that by employing a model in which the type
is not constant across observations, the astronomer is implicitly modeling “imputer noise”: in
principle the compact object type of a system should be constant for all observations from that
system, so the probability model employed by the astronomer is not physically correct, yet it is
a pragmatic solution that allows for the (inevitable) mistakes made by an imputer. Indeed it would be
unrealistic to assume any probabilistic model to be accurate all of the time.
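
As a concrete illustration of the trinomial summary, the following Python sketch turns a vector of predicted labels for one system into class probability estimates with standard errors. It is not the authors' R code; the function name `class_probabilities` is hypothetical, and the standard error formula sqrt(p(1 - p)/n) is the usual binomial approximation, which we assume here because the text does not state how the standard errors are computed.

```python
import math
from collections import Counter
from typing import Dict, Sequence, Tuple

# Label convention from the text: 1 = black hole, 2 = non-pulsing neutron star, 3 = pulsar.
CLASSES = (1, 2, 3)


def class_probabilities(labels: Sequence[int]) -> Dict[int, Tuple[float, float]]:
    """Map each class to (estimated probability, standard error)."""
    n = len(labels)
    if n == 0:
        raise ValueError("at least one predicted label is required")
    counts = Counter(labels)
    estimates = {}
    for cls in CLASSES:
        p = counts.get(cls, 0) / n
        # Assumed binomial standard error of a sample proportion.
        estimates[cls] = (p, math.sqrt(p * (1.0 - p) / n))
    return estimates
```

For example, `class_probabilities([1] * 20 + [2] * 70 + [3] * 10)` assigns probability 0.7 to class 2, the non-pulsing neutron star class.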


Next we describe the probabilistic (Bayesian) model used to generate predictions for
the compact object type of CCI observations. We denote the training set as the 2-tuple
(Xtrain, Ytrain), where Xtrain is an Ntrain by 3 matrix consisting of the three CCI values of
the training points, and Ytrain is a length Ntrain vector of labels 1, 2, or 3 corresponding
to the compact object type of the system each individual data point comes from: more
precisely, 1 represents a black hole system, 2 represents a non-pulsing neutron star system,
and 3 represents a pulsing neutron star system. The test set is denoted by Xpred, which is an
Npred by 3 matrix that contains the CCI values of observations from a single system whose
compact object type we would like to predict, and we write Ypred to indicate the labels for the
observations from the system. The model aims to predict the unknown vector Ypred, and
from this estimate the probabilities that the compact object type is a black hole, pulsar,
or non-pulsar. From this point of view, Ypred is the inferential object of interest and the
remaining parameters discussed are nuisance parameters, meaning that they exist for the
mathematics of the probabilistic model employed but are not of ultimate inferential interest.


We introduce three independent latent variables, one for each compact object type (black
hole, non-pulsar, and pulsar), whose marginal distribution is a Gaussian process with mean
0 and covariance matrix $\Sigma$ which has a squared exponential kernel, a standard choice in
the computer experiments and machine learning literature, which is discussed in detail by
Rasmussen & Williams (2005), i.e.

$$\Sigma_{ij} = \sigma^2 \exp\left(-\lVert X_i - X_j \rVert^2 / \phi\right) \tag{1}$$

where $X_i$ denotes the ith row of the data matrix X for which the first Ntrain rows are
the rows of Xtrain and the rows from Ntrain + 1 to Ntrain + Npred contain the rows of
Xpred. Each latent variable is tied to the compact object type through a multinomial
logistic link function: the probability of a particular observation k to be of class l is
proportional to $\exp[\alpha_l + \beta_l Z_{k,l}]$, where $\alpha_l$ and $\beta_l$ have independent
unit normal marginal distributions, and $Z_{\cdot,l}$ is the latent variable corresponding to type l.
Note that $Z_{\cdot,l} \in \mathbb{R}^{N_{train}+N_{pred}}$, where the first Ntrain elements correspond
to the training points, $Z_{t,l}$ for short, and the remaining elements correspond to the prediction
points, $Z_{p,l}$ for short. The parameter $\alpha_l$ is the mean offset for the latent Gaussian
process corresponding to compact object type l, and the parameter $\beta_l$ is indicative of the
marginal effect that each latent variable has on the propensity of a data point to be of type l.
It is important to recapitulate that since we are interested in predicting the vector of compact
object types Ypred, the additional parameters introduced in our model ($\alpha_l$ and $\beta_l$)
are ultimately marginalized out (i.e. forgotten) as nuisance parameters. The likelihood of the
observed labels given the latent variables and remaining nuisance parameters takes a multinomial form:

$$L(\alpha, \beta, Z_t \mid Y_{train}) = \prod_{k=1}^{N_{train}} \exp\left[\alpha_{Y_{train,k}} + \beta_{Y_{train,k}} Z_{k,Y_{train,k}}\right] / N_k \tag{2}$$

where the normalizing constant $N_k$ is given by

$$N_k = \sum_{l=1}^{3} \exp\left[\alpha_l + \beta_l Z_{k,l}\right] \tag{3}$$
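
The two building blocks of the model, the squared exponential covariance of Equation (1) and the multinomial logistic link normalized as in Equation (3), can be sketched in a few lines of Python. This is an illustrative sketch rather than the authors' R implementation; the function names `sq_exp_kernel` and `class_probs` are hypothetical, and the default values of the kernel parameters are the ones the appendix reports using.

```python
import math
from typing import List, Sequence


def sq_exp_kernel(X: Sequence[Sequence[float]],
                  sigma2: float = 1.0,
                  phi: float = 0.1) -> List[List[float]]:
    """Covariance matrix with entries sigma2 * exp(-||X_i - X_j||^2 / phi)."""
    n = len(X)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            dist2 = sum((a - b) ** 2 for a, b in zip(X[i], X[j]))
            K[i][j] = K[j][i] = sigma2 * math.exp(-dist2 / phi)
    return K


def class_probs(alpha: Sequence[float],
                beta: Sequence[float],
                z_k: Sequence[float]) -> List[float]:
    """Class probabilities for one observation k.

    The probability of class l is exp(alpha_l + beta_l * z_kl) divided by the
    sum of the same expression over the three classes.
    """
    scores = [a + b * z for a, b, z in zip(alpha, beta, z_k)]
    shift = max(scores)                     # subtract the max for numerical stability
    weights = [math.exp(s - shift) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]
```

Subtracting the maximum score before exponentiating does not change the probabilities, since the common factor cancels in the ratio, but it avoids overflow for large latent values.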

For notational brevity, denote the posterior distribution of the vector Ypred,
$p^*(Y_{pred} \mid Y_{train}, X_{train}, X_{pred})$, as $p^*(Y_{pred})$. Then by the definition of
conditional probability and marginalization we have the following integral representation for
$p^*(Y_{pred})$:

$$p^*(Y_{pred}) = \int_{Z_p, Z_t, \alpha, \beta} p(Y_p, Z_p, Z_t, \alpha, \beta \mid Y_t, X_t, X_p)\, dZ_p\, dZ_t\, d\alpha\, d\beta \tag{4}$$

$$= \int_{Z_p, Z_t, \alpha, \beta} p(Y_p \mid Z_p, \alpha, \beta, -)\, p(Z_p \mid Z_t, -)\, p(Z_t, \alpha, \beta \mid Y_t, X_t, X_p)\, dZ_p\, dZ_t\, d\alpha\, d\beta \tag{5}$$

Note that we purposely overload the p(.) notation and use the dash symbol to represent
“all other variables”, for less clutter; the subscripts t and p abbreviate train and pred.
This decomposition suggests an iterative algorithm, described in the appendix, for sampling
from the posterior distribution of Ypred.


4. Results 

Here we present the predictions of the compact object types of those systems in the
test set discussed in Section 2 using the model and algorithm discussed in Section 3. The
predicted class of an XRB is the one with the maximum estimated probability, an approach
which can be justified from a decision-theoretic viewpoint because the posterior mode is
the Bayes estimator under a 0-1 loss function, a reasonable loss function for a classification
problem. The class with the maximum estimated probability is mathematically equivalent
to the class with the maximum number of posterior predictive draws for the class labels,
i.e., the posterior predictive mode.
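
The decision-theoretic justification can be written out in one line. In the sketch below, $p_l$ denotes the posterior probability of class l, a is the class we choose to report, and L is the 0-1 loss; this notation is introduced here only for illustration.

```latex
\mathbb{E}\left[ L(a, Y) \mid \text{data} \right]
  = \sum_{l=1}^{3} \mathbf{1}\{a \neq l\}\, p_l
  = 1 - p_a ,
\qquad \text{so} \qquad
\hat{a} = \arg\min_{a}\, (1 - p_a) = \arg\max_{a}\, p_a .
```

The expected loss of reporting class a is the probability that a is wrong, so it is minimized by reporting the most probable class.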

Table 1 lists the predictions for the 6 systems whose classifications are known, in
addition to probability estimates and associated standard errors. Using this scheme there
are no misclassifications for this group of X-ray binary systems. Additionally, in Table 2
we include the predictions and probability estimates for “burster” non-pulsing systems, for
which there are a number of wrong predictions: 4 out of 6 systems. In all cases these systems
are mistaken for systems containing black holes. Visual inspection of the data is consistent with
this result because the regions these systems occupy overlap the region defined by
systems containing black holes: for instance, consider Figure 5 comparing a burster system
that is misclassified as a black hole system to a burster system that is not misclassified,
where there appears to be significantly more overlap with the black hole system training
data for the misclassified system. It is also possible that some of these systems are
misclassified as non-pulsing neutron star systems in the literature, yet significantly more
scientific investigation must be performed in order to verify this possibility. Finally, in Table
3 we include predictions and probability estimates for unclassified or ambiguously classified
XRB systems, for which notably GX 3+1 has a reasonably high estimated probability of
being a non-pulsing neutron star system: 0.7674 with a standard error of 0.0326.

It is important to note that the entire distribution of posterior predictive draws provides
significantly more spatial information than a point estimate for compact object type. For
instance, consider the systems Ser X-1 and Aql X-1, the first of which is a properly classified
non-pulsing system, and the second of which is improperly classified as a black
hole. From Figure 6 there seems to be no question that Ser X-1 is indeed a non-pulsing
neutron star system, since the proportion of posterior predictive draws for this class is 0.9341
with a standard error of 0.0133. On the other hand, while Aql X-1 is not properly classified, as is
evident in Figure 7, we do see some signal for a non-pulsar, evidenced by the 0.3093 estimated
probability of being a non-pulsing system, with standard error 0.0507.


5. Summary and Future Directions 


The main objective of this work has been to develop a probabilistic model for predicting
the compact object type of CCI observations from an XRB and to use the predictions
generated from this model to estimate the probabilities that the compact object type is a
black hole, non-pulsing neutron star, or pulsar. We have shown that the model we have
developed works reasonably well for this purpose based on the accurate classification of well
known X-ray binaries, but note that the model seems to make mistakes for the classification
of bursters that are close to the boundary between black hole systems and non-pulsars in
the CCI coordinate system. This suggests further investigation of these systems as well as
refinement of our approach, including the sampling of data, models, and algorithms used.
It is also possible that some of these “burster” systems are inappropriately classified, but
more scientific investigation must be made before such a claim can be vigorously asserted.


In order to improve the predictive accuracy of our classification scheme, we can extend
the imputation model by imposing a distribution on the Gaussian process parameters
or else using a cross validation approach to fine tune these parameters. There is a
growing literature on Gaussian process prediction that attempts to bypass the associated
computational impediments, and so we would like to investigate these methods and their
potential application to this problem further, e.g. the INLA method introduced by Rue
(2009). Additionally, the RXTE ASM data contain a large fraction of missing data due to
the signal to noise threshold we employ, and so we may consider applying the framework of
Little & Rubin (2002) to model this missing data. The primary advantages of this approach
are that it utilizes a Bayesian model for generating imputations, which is consistent with
our model for compact object prediction, and it may lead to more plausible predictions
for the unknown compact object types. Additionally, we may want to consider different
subsampling schemes besides the one we employ to ensure that they do not corrupt the
inherent structure in the data set. Finally, in addition to predictive accuracy, to make the
model more scientifically relevant we may want to include physically meaningful parameters;
the inference of such parameters may explain the scientific reasons for the separation of
observations into different regions by compact object type.


The CCI method by definition uses measurements of X-ray intensity and color in two
X-ray bands. This information will in general reflect not only the properties of the
source but also the absorption of the intrinsic spectrum by the interstellar medium (ISM).
ISM absorption will clearly affect the lowest energy band the most, and thus the soft
color. However, at higher column densities, the hard color will be affected as well. We
are developing a general method for correction of CCI plots given the sensitivity curve
of an X-ray monitoring telescope and likely models of the spectral shape (Boroson et
al., in preparation). The eROSITA telescope, developed at the Max Planck Institute for
Extraterrestrial Physics and due to be launched in 2016, has 20 times the sensitivity of the
ROSAT/ASM in the low energy band and will be particularly beneficial for studying the ISM
(Merloni et al. 2012).


We can extend our long-range study using data from past and present large field of
view X-ray instruments such as MAXI (Matsuoka et al. 2009), the HETE-WXM (Yoshida
et al. 1995) and BeppoSAX-WFC (Boella et al. 1997). Current and planned X-ray telescopes
with high sensitivity such as Chandra (Weisskopf et al. 2002), XMM (Mason et al. 1995),
and eROSITA (Merloni et al. 2012) will enable us to apply our methodology to XRBs of
much lower luminosity.

Finally, we reiterate that the R code we have written to make predictions for this
analysis ought to be applicable to other CCI data sets quite easily, and so we have provided
it for public use.

The Harvard ICHASC is acknowledged for their helpful feedback. Additionally, GG
and SDV would like to acknowledge partial support through a Smithsonian Institution
CGPS grant to SDV.

Facilities: Harvard-Smithsonian Center for Astrophysics Harvard University Odyssey
Supercomputer.


A. Algorithm description 


As expounded upon in detail by Gelman et al. (2013), prediction in the Bayesian
paradigm essentially follows an iterative scheme: first draw from the posterior
distribution of model parameters and latent variables through a Monte Carlo simulation,
and then draw from the predictive distribution of interest, which in our case is that of
Ypred, conditional on these draws. Adapting this general strategy for our problem, Equations
(4) and (5) from Section 3 suggest the following iterative algorithm for sampling from the
posterior predictive distribution for the compact object type.

• Sample from the posterior distribution $p(Z_{train}, \alpha, \beta \mid X_{train}, X_{pred}, Y_{train})$
using elliptical slice sampling, which applies because of the joint multivariate normal
distribution of $(\alpha, \beta, Z_{train})$. The method of elliptical slice sampling was
introduced by Murray, Adams, & MacKay (2010).

• Sample the posterior latent variables at the prediction points, Zpred, using
the conditional multivariate normal distribution $p(Z_{pred} \mid Z_{train}, -)$,
which has mean $\Sigma_{pred,train} \Sigma_{train,train}^{-1} Z_{train}$ and covariance matrix
$\Sigma_{pred,pred} - \Sigma_{pred,train} \Sigma_{train,train}^{-1} \Sigma_{train,pred}$ due to
fundamental properties of conditional MVN distributions.^1

• Sample Ypred from a multinomial distribution conditional on the posterior latent
draw of $Z_{pred}, \alpha, \beta$, where, as mentioned above, the probability for $Y_{pred,k}$ to be
of type l is proportional to $\exp[\alpha_l + \beta_l Z_{pred,k,l}]$.

Additionally we set $\sigma^2 = 1$ and $\phi = 0.1$.^2
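
The first step of the algorithm relies on elliptical slice sampling. A single update, following the published description by Murray, Adams, & MacKay (2010), can be sketched in Python as below. This is not the authors' Rcpp implementation: the names `elliptical_slice_step` and `demo_posterior_mean` are hypothetical, the state is a plain list of floats, and the caller is assumed to supply `prior_draw`, a fresh draw from the zero-mean Gaussian prior. In the model above the state would stack $(\alpha, \beta, Z_{train})$; here a one-dimensional toy problem is used instead.

```python
import math
import random
from typing import Callable, List, Sequence


def elliptical_slice_step(state: Sequence[float],
                          prior_draw: Sequence[float],
                          log_lik: Callable[[Sequence[float]], float],
                          rng: random.Random) -> List[float]:
    """One elliptical slice sampling update for a zero-mean Gaussian prior."""
    # Log-likelihood threshold defining the slice.
    log_y = log_lik(state) + math.log(1.0 - rng.random())
    # Initial angle and bracket on the ellipse through state and prior_draw.
    theta = rng.uniform(0.0, 2.0 * math.pi)
    theta_min, theta_max = theta - 2.0 * math.pi, theta
    while True:
        proposal = [f * math.cos(theta) + nu * math.sin(theta)
                    for f, nu in zip(state, prior_draw)]
        if log_lik(proposal) > log_y:
            return proposal
        # Shrink the bracket towards theta = 0 (the current state) and retry.
        if theta < 0.0:
            theta_min = theta
        else:
            theta_max = theta
        theta = rng.uniform(theta_min, theta_max)


def demo_posterior_mean(n_steps: int = 5000, seed: int = 0) -> float:
    """Toy check: N(0, 1) prior and one observation y = 2 with unit noise."""
    rng = random.Random(seed)

    def log_lik(f: Sequence[float]) -> float:
        return -0.5 * (f[0] - 2.0) ** 2

    state = [0.0]
    total = 0.0
    for _ in range(n_steps):
        prior_draw = [rng.gauss(0.0, 1.0)]
        state = elliptical_slice_step(state, prior_draw, log_lik, rng)
        total += state[0]
    return total / n_steps
```

In the toy problem the exact posterior is normal with mean 1 and variance 1/2, so the average returned by `demo_posterior_mean` should be close to 1. For a multivariate prior with covariance $\Sigma$, the prior draw would be obtained by multiplying a vector of standard normal draws by a Cholesky factor of $\Sigma$.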


Elliptical slice sampling is a Monte Carlo algorithm developed to simulate from a
posterior probability distribution where the prior distribution is jointly multivariate normal,
a condition that holds in our model as discussed in Section 3. As explained by Murray,
Adams, & MacKay (2010), this is a scenario where traditional Monte Carlo methods
applied within a Bayesian context, such as Gibbs sampling or Metropolis-Hastings, perform
poorly. Routines to implement elliptical slice sampling and draw from the posterior
distribution of Zpred and Ypred were written in the R programming language using the
Rcpp, RcppEigen and RcppArmadillo packages for the efficient inline implementations
of linear algebraic routines in C++ (R Core Team 2015; Bates & Eddelbuettel 2013;
Eddelbuettel 2013; Eddelbuettel & Sanderson 2014; Skylar et al. 2015). Additional packages
used in the testing and development of this code were mvtnorm and MASS (Genz et al.
2014; Venables & Ripley 2002). As discussed by Murray, Adams, & MacKay (2010), the
computational impediments of elliptical slice sampling stem primarily from determining
the Cholesky decomposition of and inverting a multivariate normal covariance matrix.
RcppEigen and RcppArmadillo provide efficient implementations for determining the
Cholesky decomposition and performing matrix inversion that can be conveniently included
directly within R code. This code, along with the RXTE ASM data, is freely available at
https://github.com/ggopalan/XRay-Binary-Classification.

^1 Note that $\Sigma_{pred,train}$ is the submatrix consisting of the rows Ntrain + 1 to Npred + Ntrain
and columns 1 to Ntrain of $\Sigma$. The other submatrices of $\Sigma$ used in the previous formula are
analogously defined.

^2 More judicious ways of selecting these parameters are discussed in Section 5.
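
To make the role of the Cholesky decomposition concrete, the following Python sketch computes the conditional mean and covariance of the latent variables at the prediction points given those at the training points. It is a plain standard-library illustration, not the authors' RcppEigen or RcppArmadillo code; the function names `cholesky`, `chol_solve`, and `conditional_gaussian` are hypothetical, and a practical implementation would typically add a small value to the diagonal of the training covariance for numerical stability, which is omitted here.

```python
import math
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def cholesky(A: Sequence[Sequence[float]]) -> Matrix:
    """Lower-triangular L with L L^T = A, for a symmetric positive definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L


def chol_solve(L: Matrix, b: Sequence[float]) -> List[float]:
    """Solve (L L^T) x = b by forward and then backward substitution."""
    n = len(L)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def conditional_gaussian(S_tt: Matrix, S_pt: Matrix, S_pp: Matrix,
                         z_train: Sequence[float]) -> Tuple[List[float], Matrix]:
    """Mean S_pt S_tt^{-1} z_train and covariance S_pp - S_pt S_tt^{-1} S_pt^T."""
    L = cholesky(S_tt)
    w = chol_solve(L, z_train)                       # S_tt^{-1} z_train
    mean = [sum(a * b for a, b in zip(row, w)) for row in S_pt]
    solved = [chol_solve(L, row) for row in S_pt]    # S_tt^{-1} times each row of S_pt
    n_pred = len(S_pt)
    cov = [[S_pp[i][j] - sum(a * b for a, b in zip(S_pt[i], solved[j]))
            for j in range(n_pred)] for i in range(n_pred)]
    return mean, cov
```

Solving with the Cholesky factor avoids forming the explicit inverse of the training covariance; the factorization is the step that scales cubically in the number of training points.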
Fig. 1. Visualization of RXTE ASM data for 24 XRBs over 13 years including only the 5σ
threshold values. Each individual point is the one day average of the CCI data from one of
the 24 systems. Black points are observations from black hole XRB systems, red points are
observations from non-pulsing neutron star XRB systems, and blue points are observations
from pulsing neutron star XRB systems. The general pattern is that observations from
different system types separate geometrically in this CCI coordinate system. (Note that
since CCI coordinates are defined in terms of ratios of counts, they are unitless.)










[Figure: Intensities by System (Training)]

Fig. 2. Visualization of intensities for observations by each of the 24 systems within the
training data set illustrating substantial variability. Systems 1-9 are black hole systems, 10- 
15 are non-pulsing neutron star systems, and 16-24 are pulsing neutron star systems. This 
may explain the wide variability for the number of observations of each system above the 
signal to noise threshold, since fainter measurements tend to be noisier.

Fig. 3. An illustration of the wide variability in CCI data between different systems that
contain black holes, where each color represents data from a different system.

[Figure: Number of Observations by System Before Subsampling Training Set]




Fig. 4. A comparison of the distribution of the number of observations by system in the
training set before and after subsampling.

Fig. 5. Left: An example of a burster system in blue, Aql X-1, improperly classified as
a black hole system by the algorithm, with comparison to black hole training data in red.
Right: An example of a burster system in blue, Ser X-1, properly classified as a non-pulsing
system by the algorithm, with comparison to the black hole training data in red.

Fig. 6. An example of a non-pulsing neutron star system, Ser X-1, that is properly classified
by the classifier. The left panel shows all of the observations from Ser X-1 and the associated
predictions of each observation, with the mode taken to be the prediction. The histogram
on the right illustrates the estimated probabilities that Ser X-1 is of each of the three classes.




[Figure axis label: Compact Object Type]

Fig. 7. An example of a non-pulsing neutron star system, Aql X-1, that is improperly
classified by the classifier, mistaken for a black hole system. There appears to be some
signal for a non-pulsar system, however.




















Table 1. Probability estimates and predictions for compact object type of previously
Table 2. Probability estimates and predictions for compact object type of “burster” XRBs
Table 3. Probability estimates and predictions for compact object type of unclassified or